Mixed Integer Linear Programming for Exact Finite-Horizon Planning in Decentralized POMDPs
Abstract
We consider the problem of finding an n-agent joint policy for the optimal finite-horizon control of a decentralized POMDP (Dec-POMDP). This problem has very high complexity (NEXP-hard for n ≥ 2). In this paper, we propose a new mathematical programming approach to it. Our approach is based on two ideas. First, we represent each agent's policy in the sequence form rather than the tree form, thereby obtaining a very compact representation of the set of joint policies. Second, using this compact representation, we solve the problem as an instance of combinatorial optimization, for which we formulate a mixed integer linear program (MILP). The optimal solution of the MILP directly yields an optimal joint policy for the Dec-POMDP. Computational experience shows that formulating and solving the MILP takes significantly less time on benchmark Dec-POMDP problems than existing algorithms. For example, the multiagent tiger problem for horizon 4 is solved in 72 seconds with the MILP, whereas existing algorithms require several hours.
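To give a rough sense of why the sequence form is more compact than the tree form, the sketch below counts, for a single agent, the number of deterministic policy trees against the number of action-observation histories of length up to the horizon. The counting formulas are the standard ones for policy trees and histories and are not taken from the abstract. The figures of 3 actions and 2 observations per agent for the multiagent tiger problem are assumptions used only for illustration, and the function names are hypothetical.

```python
def tree_policy_count(num_actions: int, num_observations: int, horizon: int) -> int:
    """Count deterministic tree-form policies for one agent.

    A policy tree has one decision node per observation sequence of
    length 0 .. horizon-1, and each node picks one action.
    """
    nodes = sum(num_observations ** t for t in range(horizon))
    return num_actions ** nodes


def history_count(num_actions: int, num_observations: int, horizon: int) -> int:
    """Count action-observation histories of length 1 .. horizon for one agent.

    A history of length t is a1 o1 a2 o2 ... a_t: t actions interleaved
    with t-1 observations.
    """
    return sum(
        num_actions ** t * num_observations ** (t - 1)
        for t in range(1, horizon + 1)
    )


if __name__ == "__main__":
    # Assumed sizes for the multiagent tiger problem (not stated in the abstract).
    actions, observations, horizon = 3, 2, 4
    print("tree-form policies per agent:", tree_policy_count(actions, observations, horizon))
    print("histories per agent:", history_count(actions, observations, horizon))
```

Under these assumed sizes, each agent has 14,348,907 deterministic policy trees at horizon 4 but only 777 histories. This is presumably the kind of gap the abstract refers to as a very compact representation: a formulation indexed by histories stays far smaller than one that enumerates policy trees.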
Similar resources
An Investigation into Mathematical Programming for Finite Horizon Decentralized POMDPs
Decentralized planning in uncertain environments is a complex task generally dealt with by using a decision-theoretic approach, mainly through the framework of Decentralized Partially Observable Markov Decision Processes (DEC-POMDPs). Although DEC-POMDPs are a general and powerful modeling tool, solving them is a task of overwhelming complexity that can be doubly exponential. In this paper...
Memory-Bounded Dynamic Programming for DEC-POMDPs
Decentralized decision making under uncertainty has been shown to be intractable when each agent has different partial information about the domain. Thus, improving the applicability and scalability of planning algorithms is an important challenge. We present the first memory-bounded dynamic programming algorithm for finite-horizon decentralized POMDPs. A set of heuristics is used to identify r...
Dual Formulations for Optimizing Dec-POMDP Controllers
Decentralized POMDP is an expressive model for multiagent planning. Finite-state controllers (FSCs)—often used to represent policies for infinite-horizon problems—offer a compact, simple-to-execute policy representation. We exploit novel connections between optimizing decentralized FSCs and the dual linear program for MDPs. Consequently, we describe a dual mixed integer linear program (MIP) for...
Using linear programming duality for solving finite horizon Dec-POMDPs
This paper studies the problem of finding an optimal finite horizon joint policy for a decentralized partially observable Markov decision process (Dec-POMDP). We present a new algorithm for finding an optimal joint policy. The algorithm is based on the fact that the necessary condition for a joint policy to be optimal is that it be locally optimal (that is, a Nash equilibrium). Through the appl...
Exact Mixed Integer Programming for Integrated Scheduling and Process Planning in Flexible Environment
This paper presents a mixed integer programming model for integrated scheduling and process planning. The process plan includes orders with precedence relations similar to those of the Multiple Traveling Salesman Problem (MTSP), which is categorized as an NP-hard problem. These types of problems are also called advanced planning because of simultaneously determining the appropriate sequence and m...